Revolutionary Database Technology for Data Intensive Research

Authors

  • Martin L. Kersten
  • Stefan Manegold
Abstract

The heart of a scientific data warehouse is its database system, running on a modern distributed platform, and used for both direct interaction with data gathered from experimental devices and management of the derived knowledge using workflow software. However, most (commercial) DBMS offerings cannot fulfill the demanding needs of scientific data management. They fall short in one or more of the following areas: multi-paradigm data models (including support for arrays), transparent data ingestion from, and seamless integration of, scientific file repositories, complex event processing, and provenance. These topics only scratch the surface of the problem.

The state of the art in scientific data exploration can be compared with our daily use of search engines. For a large part, search engines rely on guiding users from their ill-phrased queries through successive refinement to the information of interest. Limited a priori knowledge is required. The sample answers returned provide guidance to drill down, chasing individual links, or to adjust the query terms. The situation in scientific databases is more cumbersome than searching for text, because they often contain complex observational data, e.g. telescope images of the sky, satellite images of the earth, time series or seismograms, and little a priori knowledge exists. The prime challenge is to find models that capture the essence of this data at both a macro- and micro-scale. The answer is in the database, but the 'Nobel-winning query' is still unknown.

Next-generation database management engines should provide a much richer repertoire and ease-of-use experience to cope with the deluge of observational data in a resource-limited setting. Good is good enough as an answer, provided the journey can be continued as long as the user remains interested. We envision seven directions of long-term research in database technology:

  • Data Vaults. Scientific data is usually available in self-descriptive file formats as produced by advanced scientific instruments. The need to convert these formats into relational tables and to explicitly load all data into the DBMS forms a major hurdle for database-supported scientific data analysis. Instead, we propose the data vault, a database-attached external file repository. The data vault creates a true symbiosis between a DBMS and existing file-based repositories, and thus provides transparent access to all data kept in the repository through the DBMS's (array-based) query language.
  • Array support. Scientific data management calls for DBMSs that integrate the genuine scientific data model, multi-dimensional arrays, …
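The data vault idea described above, attaching files to the database and reading them only when a query needs them, can be illustrated with a small sketch. This is not the authors' implementation: the class name `DataVault` and its methods `attach`, `scan`, `select` and `loaded` are hypothetical, and a CSV file stands in for a self-descriptive scientific file format.

```python
import csv
import os
import tempfile


class DataVault:
    """Hypothetical sketch: a catalog of external files parsed only when queried."""

    def __init__(self):
        self._catalog = {}  # table name -> file path (metadata only, no data loaded)
        self._cache = {}    # table name -> rows parsed on first access

    def attach(self, name, path):
        # Attaching records where the file lives; the file itself is not read here.
        self._catalog[name] = path

    def loaded(self):
        # Names of the attached files whose contents have actually been parsed.
        return sorted(self._cache)

    def scan(self, name):
        # Parse the file on first access and keep the rows for later queries.
        if name not in self._cache:
            with open(self._catalog[name], newline="") as f:
                self._cache[name] = list(csv.DictReader(f))
        return self._cache[name]

    def select(self, name, predicate):
        # A query touches only the file it names; other attached files stay unread.
        return [row for row in self.scan(name) if predicate(row)]


def demo():
    # Build a throwaway file repository with one small time series (assumed data).
    with tempfile.TemporaryDirectory() as repo:
        path = os.path.join(repo, "seismogram.csv")
        with open(path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["t", "amplitude"])
            writer.writerows([[0, 0.1], [1, 2.5], [2, 0.3]])

        vault = DataVault()
        vault.attach("seismogram", path)
        before = vault.loaded()  # nothing has been parsed at this point
        strong = vault.select("seismogram", lambda r: float(r["amplitude"]) > 1.0)
        after = vault.loaded()   # the queried file is now cached
        return before, strong, after
```

Calling `demo()` shows that attaching a file loads nothing, and that the first query against it triggers the parse. A real system would additionally map file contents onto the DBMS's own storage and query language, which this sketch does not attempt.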


Similar resources

Next Generation Database Management Systems Technology

Database management systems (DBMS) technology is advancing in two directions. The evolutionary or parochial direction raises questions such as: How can DBMS technology meet current new application requirements? What features might it offer users? The revolutionary direction raises questions such as: What requirements will the next generation of computing, applications, and users place on DBMS t...


Design and Implementation of a Comprehensive Database of the Written Heritage of Science and Technology

Purpose: This study aims to design and implement a comprehensive database of the written heritage of science and technology in the Regional Information Center for Science and Technology (RICeST) and determine the metadata elements required to describe the manuscripts. Method: This study was carried out by the content analysis method to identify the metadata elements needed to describe the coll...


On the Construction of Mobile Database Management Systems

Managing the infoglut in this information age requires more than the traditional and fixed computing environment in order to keep up with fast changing technologies. Cellular and mobile communications offer flexibility in accessing and manipulating information without restricting users to specific locations. While this technology is revolutionary in its approach to managing data, many issues remain to...


Cache Management Issues in Mobile Computing Environment

Mobile computing is a revolutionary technology which enables us to access information anytime and anywhere. Recently, there has been much research into mobile computing. Caching techniques reduce bandwidth consumption and data access delay. In this paper, we discuss the different impacts that mobile computing has had in the area of data management. In wireless communication the...


Towards P2P XML Database Technology

To ease the development of data-intensive P2P applications, we envision a P2P XML Database Management System (P2P XDBMS) that acts as a database middle-ware, providing a uniform database abstraction on top of a dynamic set of distributed data sources. In this PhD work, we research which features such a database abstraction should offer and how it can be realised efficiently by extending and com...


Journal title:
  • ERCIM News

Volume 2012

Publication date: 2012